
    Proximal boosting and its acceleration

    Gradient boosting is a prediction method that iteratively combines weak learners to produce a complex and accurate model. From an optimization point of view, the learning procedure of gradient boosting mimics a gradient descent on a functional variable. This paper introduces a novel boosting approach, called proximal boosting, which builds upon the proximal point algorithm when the empirical risk to minimize is not differentiable. Besides being motivated by non-differentiable optimization, the proposed algorithm benefits from Nesterov's acceleration in the same way as gradient boosting [Biau et al., 2018]. This leads to a variant, called accelerated proximal boosting. The advantages of leveraging proximal methods for boosting are illustrated by numerical experiments on simulated and real-world data. In particular, we exhibit a favorable comparison over gradient boosting regarding convergence rate and prediction accuracy.
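
    A minimal sketch of the idea, for the non-differentiable absolute loss, where the functional proximal step has a closed form (soft-thresholding of the residuals). The tree-based weak learner, step size, and constant initialization are illustrative choices, not the paper's exact algorithm, and Nesterov's acceleration is omitted.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def prox_abs_loss(pred, y, gamma):
    # Proximal operator of f -> |y - f| evaluated at pred: componentwise
    # soft-thresholding of the residual pred - y with threshold gamma.
    r = pred - y
    return y + np.sign(r) * np.maximum(np.abs(r) - gamma, 0.0)

def proximal_boosting(X, y, n_rounds=100, gamma=1.0, lr=0.1, max_depth=3):
    y = np.asarray(y, dtype=float)
    pred = np.full(len(y), y.mean())          # constant initial model
    learners = []
    for _ in range(n_rounds):
        # The weak learner is fitted to the proximal step rather than to the
        # raw (sub)gradient, which is where this sketch departs from gradient boosting.
        target = prox_abs_loss(pred, y, gamma) - pred
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, target)
        learners.append(tree)
        pred = pred + lr * tree.predict(X)
    return learners, pred
```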

    Robust Lasso-Zero for sparse corruption and model selection with missing covariates

    We propose Robust Lasso-Zero, an extension of the Lasso-Zero methodology [Descloux and Sardy, 2018], initially introduced for sparse linear models, to the sparse corruptions problem. We give theoretical guarantees on sign recovery of the parameters for a slightly simplified version of the estimator, called Thresholded Justice Pursuit. The use of Robust Lasso-Zero is showcased for variable selection with missing values in the covariates. In addition to requiring neither a model for the covariates, nor estimates of their covariance matrix or of the noise variance, the method has the great advantage of handling missing-not-at-random values without specifying a parametric model. Numerical experiments and a medical application underline the relevance of Robust Lasso-Zero in such a context, where few competitors are available. The method is easy to use and implemented in the R library lass0.
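
    A minimal sketch of the augmented-design idea behind Thresholded Justice Pursuit: sparse corruptions are absorbed by an identity block appended to the design, the joint vector is estimated by a near-noiseless Lasso (standing in here for exact basis pursuit), and the regression coefficients are hard-thresholded. The scaling of the identity block, the choice of threshold, and the noise dictionary used in the full Robust Lasso-Zero estimator are simplified away; the helper name is hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

def thresholded_justice_pursuit(X, y, tau, alpha=1e-4):
    # Jointly estimate regression coefficients and sparse corruptions.
    n, p = X.shape
    X_aug = np.hstack([X, np.eye(n)])             # [X | I]: corruption columns
    fit = Lasso(alpha=alpha, fit_intercept=False, max_iter=100_000).fit(X_aug, y)
    beta, corruption = fit.coef_[:p], fit.coef_[p:]
    beta_thr = np.where(np.abs(beta) > tau, beta, 0.0)   # hard threshold for sign recovery
    return beta_thr, corruption
```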

    On the asymptotic rate of convergence of Stochastic Newton algorithms and their Weighted Averaged versions

    The majority of machine learning methods can be regarded as the minimization of an unavailable risk function. To optimize this function from samples provided in a streaming fashion, we define a general stochastic Newton algorithm and its weighted averaged version. In several use cases, neither implementation requires inverting a Hessian estimate at each iteration; instead, the estimate of the inverse Hessian is updated directly, which generalizes a trick introduced in [2] for the specific case of logistic regression. Under mild assumptions such as local strong convexity at the optimum, we establish almost sure convergence and rates of convergence of the algorithms, as well as central limit theorems for the constructed parameter estimates. The unified framework considered in this paper covers the cases of linear, logistic and softmax regressions, to name a few. Numerical experiments on simulated data give empirical evidence of the pertinence of the proposed methods, which outperform popular competitors, particularly in the case of bad initializations. Comment: Computational Optimization and Applications, 202
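
    The direct update of the inverse Hessian can be illustrated with a rank-one Sherman-Morrison update, as in the logistic-regression case the abstract refers to. Below is a bare-bones streaming sketch, without the weighted averaging studied in the paper; the ridge initialization and the exact step scaling are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def sherman_morrison(S_inv, u, c):
    # Inverse of (S + c * u u^T), given S_inv and a rank-one update c * u u^T.
    Su = S_inv @ u
    return S_inv - c * np.outer(Su, Su) / (1.0 + c * u @ Su)

def stochastic_newton_logistic(stream, d, ridge=1.0):
    theta = np.zeros(d)
    S_inv = np.eye(d) / ridge        # inverse of the running sum of Hessian terms
    for x, y in stream:              # one (features, label in {0, 1}) pair at a time
        p = 1.0 / (1.0 + np.exp(-x @ theta))
        S_inv = sherman_morrison(S_inv, x, p * (1.0 - p))   # no matrix inversion needed
        theta = theta - S_inv @ ((p - y) * x)               # Newton-type step
    return theta
```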

    Sampling by blocks of measurements in compressed sensing

    Various acquisition devices impose sampling blocks of measurements. A typical example is parallel magnetic resonance imaging (MRI), where several radio-frequency coils simultaneously acquire a set of Fourier-modulated coefficients. We study a new random sampling approach that consists of selecting a set of blocks predefined by the application of interest. We provide theoretical results on the number of blocks required for exact sparse signal reconstruction. We finish by illustrating these results on various examples, and discuss their connection to the literature on CS.
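
    As a toy illustration of block-structured sampling, the sketch below keeps a random subset of full rows of a 2-D Fourier grid (mimicking line-by-line MRI readouts) and reconstructs a sparse image with a plain ISTA loop. The row blocks, uniform block selection, unit step size and l1 solver are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def row_block_mask(shape, n_blocks, rng):
    # Each block is a full row of the k-space grid; keep n_blocks rows at random.
    mask = np.zeros(shape, dtype=bool)
    rows = rng.choice(shape[0], size=n_blocks, replace=False)
    mask[rows, :] = True
    return mask

def ista_reconstruct(y, mask, lam=1e-3, n_iter=200):
    # Minimize 0.5 * ||mask * FFT(x) - y||^2 + lam * ||x||_1 (x assumed sparse itself).
    x = np.zeros(mask.shape, dtype=complex)
    for _ in range(n_iter):
        grad = np.fft.ifft2(mask * (np.fft.fft2(x, norm="ortho") - y), norm="ortho")
        x = x - grad                               # unit step: masked unitary FFT has norm <= 1
        x *= np.maximum(1.0 - lam / np.maximum(np.abs(x), 1e-12), 0.0)   # complex soft-threshold
    return x
```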

    Convergence and error analysis of PINNs

    Physics-informed neural networks (PINNs) are a promising approach that combines the power of neural networks with the interpretability of physical modeling. PINNs have shown good practical performance in solving partial differential equations (PDEs) and in hybrid modeling scenarios, where physical models enhance data-driven approaches. However, it is essential to establish their theoretical properties in order to fully understand their capabilities and limitations. In this study, we highlight that classical training of PINNs can suffer from systematic overfitting. This problem can be addressed by adding a ridge regularization to the empirical risk, which ensures that the resulting estimator is risk-consistent for both linear and nonlinear PDE systems. However, the strong convergence of PINNs to a solution satisfying the physical constraints requires a more involved analysis using tools from functional analysis and the calculus of variations. In particular, for linear PDE systems, an implementable Sobolev-type regularization makes it possible to reconstruct a solution that not only achieves statistical accuracy but also remains consistent with the underlying physics.
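
    A minimal sketch of the ridge-regularized training objective described above, for the toy 1-D heat equation u_t = u_xx: an empirical-risk term on observed data, a PDE-residual term on collocation points, and a ridge penalty on the network weights. The architecture, the PDE and the equal weighting of the terms are illustrative assumptions, not the paper's setting.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))

def pinn_ridge_loss(net, x_data, u_data, x_colloc, ridge=1e-4):
    data_term = torch.mean((net(x_data) - u_data) ** 2)            # empirical risk on observations
    z = x_colloc.clone().requires_grad_(True)                      # collocation points, columns (t, x)
    u = net(z)
    du = torch.autograd.grad(u.sum(), z, create_graph=True)[0]
    u_t, u_x = du[:, :1], du[:, 1:]
    u_xx = torch.autograd.grad(u_x.sum(), z, create_graph=True)[0][:, 1:]
    pde_term = torch.mean((u_t - u_xx) ** 2)                       # heat-equation residual
    ridge_term = sum((p ** 2).sum() for p in net.parameters())     # ridge regularization
    return data_term + pde_term + ridge * ridge_term
```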

    Block-constrained compressed sensing

    This PhD thesis is dedicated to combining compressed sensing (CS) with block-structured acquisition. In the first part of this work, theoretical CS results are derived under block-acquisition constraints, for the recovery of any s-sparse signal and for the recovery of a vector with a given support S. We show that structured acquisition can be successfully used in a CS framework, provided that the signal to reconstruct presents an additional structure in its sparsity, adapted to the sampling constraints. In the second part of this work, we propose numerical methods to generate efficient block sampling schemes. These methods rely on techniques for projecting a probability measure onto a set of admissible measures.
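
    As a crude stand-in for the measure-projection step mentioned above, one can aggregate a target per-sample density into per-block probabilities and draw blocks accordingly. The actual projection techniques of the thesis are more involved; the function names and the partition-into-blocks interface below are purely illustrative.

```python
import numpy as np

def draw_block_scheme(target_density, blocks, n_blocks, rng):
    # blocks: list of index arrays partitioning the measurements into predefined blocks.
    masses = np.array([target_density[b].sum() for b in blocks])
    probs = masses / masses.sum()                 # per-block probability proportional to covered mass
    chosen = rng.choice(len(blocks), size=n_blocks, replace=False, p=probs)
    return np.concatenate([blocks[i] for i in chosen])
```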

    HYR2PICS: Hybrid Regularized Reconstruction for combined Parallel Imaging and Compressive Sensing in MRI

    Both parallel Magnetic Resonance Imaging (pMRI) and Compressed Sensing (CS) are emerging techniques to accelerate conventional MRI by reducing the amount of data acquired in k-space. So far, first attempts to combine sensitivity encoding (SENSE) imaging in pMRI with CS have been proposed in the context of Cartesian trajectories. Here, we extend these approaches to non-Cartesian trajectories by jointly formulating the CS and SENSE recovery in a hybrid Fourier/wavelet framework and optimizing a convex but nonsmooth criterion. On anatomical MRI data, we show that HYR2PICS outperforms wavelet-based regularized SENSE reconstruction. Our results are also in agreement with the Transform Point Spread Function (TPSF) criterion, which measures the degree of incoherence of k-space undersampling schemes.
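
    A minimal sketch of the kind of criterion described above: recover an image from undersampled, coil-weighted Fourier data by minimizing a least-squares data term summed over coils plus a wavelet l1 penalty, here with a plain ISTA loop. The Haar wavelet, the real-valued image, the unit step size (which assumes coil sensitivities normalized so that sum_c |S_c|^2 <= 1), and dyadic image dimensions are simplifying assumptions rather than the paper's exact algorithm.

```python
import numpy as np
import pywt

def sense_grad(x, coil_maps, mask, ys):
    # Gradient of 0.5 * sum_c ||mask * FFT(S_c x) - y_c||^2 with respect to x.
    g = np.zeros(x.shape, dtype=complex)
    for S_c, y_c in zip(coil_maps, ys):
        r = mask * (np.fft.fft2(S_c * x, norm="ortho") - y_c)
        g += np.conj(S_c) * np.fft.ifft2(r, norm="ortho")
    return g

def ista_cs_sense(coil_maps, mask, ys, lam=1e-3, n_iter=100, step=1.0):
    x = np.zeros(mask.shape)                       # toy setting: real-valued image
    for _ in range(n_iter):
        x = x - step * np.real(sense_grad(x, coil_maps, mask, ys))
        # Proximal step: soft-threshold the Haar wavelet coefficients of x.
        arr, slices = pywt.coeffs_to_array(pywt.wavedec2(x, "haar"))
        arr = np.sign(arr) * np.maximum(np.abs(arr) - step * lam, 0.0)
        x = pywt.waverec2(pywt.array_to_coeffs(arr, slices, output_format="wavedec2"), "haar")
        x = x[:mask.shape[0], :mask.shape[1]]      # crop in case the wavelet padding grows x
    return x
```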

    Missing Data Imputation using Optimal Transport

    Missing data is a crucial issue when applying machine learning algorithms to real-world datasets. Starting from the simple assumption that two batches extracted randomly from the same dataset should share the same distribution, we leverage optimal transport distances to quantify this criterion and turn it into a loss function for imputing missing data values. We propose practical methods to minimize these losses using end-to-end learning, which may or may not exploit parametric assumptions on the underlying distributions of values. We evaluate our methods on datasets from the UCI repository, in MCAR, MAR and MNAR settings. These experiments show that OT-based methods match or outperform state-of-the-art imputation methods, even for high percentages of missing values.
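
    A minimal sketch of the batch criterion described above: missing entries are treated as free parameters, and the entropic OT cost between two randomly drawn batches of the filled-in data is minimized by gradient descent. The unrolled Sinkhorn loop, mean initialization and hyperparameters are illustrative simplifications (a log-domain Sinkhorn or a dedicated OT library would be preferable in practice).

```python
import numpy as np
import torch

def sinkhorn_cost(a, b, eps=0.1, n_iter=50):
    # Entropic OT cost <P, C> between two equal-size batches with uniform weights.
    C = torch.cdist(a, b) ** 2
    K = torch.exp(-C / eps)
    n = a.shape[0]
    mu = torch.full((n,), 1.0 / n)
    u = torch.full((n,), 1.0 / n)
    v = torch.full((n,), 1.0 / n)
    for _ in range(n_iter):                      # unrolled Sinkhorn iterations
        u = mu / (K @ v)
        v = mu / (K.T @ u)
    return torch.sum(u[:, None] * K * v[None, :] * C)

def ot_impute(X_missing, n_steps=500, batch_size=64, lr=0.01):
    mask = np.isnan(X_missing)
    X0 = np.where(mask, np.nanmean(X_missing, axis=0), X_missing)   # mean initialization
    X = torch.tensor(X0, dtype=torch.float)
    mask_t = torch.tensor(mask)
    imputed = X[mask_t].clone().requires_grad_(True)    # learnable missing entries
    opt = torch.optim.Adam([imputed], lr=lr)
    for _ in range(n_steps):
        X_filled = X.clone()
        X_filled[mask_t] = imputed
        idx = torch.randperm(X.shape[0])                # two disjoint random batches
        loss = sinkhorn_cost(X_filled[idx[:batch_size]],
                             X_filled[idx[batch_size:2 * batch_size]])
        opt.zero_grad(); loss.backward(); opt.step()
    out = X.detach().clone()
    out[mask_t] = imputed.detach()
    return out.numpy()
```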